Performance evaluation results of evolutionary clustering algorithm star for clustering heterogeneous datasets

Abstract

This article presents the data used to evaluate the performance of evolutionary clustering algorithm star (ECA*) compared to five traditional and modern clustering algorithms. Two experimental methods are employed to examine ECA* against genetic algorithm for clustering++ (GENCLUST++), learning vector quantisation (LVQ), expectation maximisation (EM), K-means++ (KM++) and K-means (KM). These algorithms are applied to 32 heterogeneous, multi-featured datasets to determine which one performs well on the three tests. First, the paper examines the efficiency of ECA* against its counterpart algorithms using clustering evaluation measures. These validation criteria are the objective function and cluster quality measures. Second, it suggests a performance rating framework to measure the sensitivity of these algorithms to various dataset features (cluster dimensionality, number of clusters, cluster overlap, cluster shape and cluster structure). The contributions of these experiments are twofold: (i) ECA* exceeds its counterpart algorithms in its ability to find the right number of clusters; (ii) ECA* is less sensitive to these dataset features than its competitive techniques. Nonetheless, the results of the experiments performed demonstrate some limitations of ECA*: (i) ECA* is not fully applied on the premise that no prior knowledge exists; (ii) adapting and utilising ECA* in several real applications has not yet been achieved.
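As a rough illustration of comparing clusterings by an objective function, the sketch below ranks candidate labelings of the same dataset by their sum of squared errors (SSE). This is an assumption for illustration only: the abstract does not state which objective function or quality measures were used, and the names `sse`, `rank_by_sse`, `algo_a` and `algo_b`, as well as the toy data, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def sse(points: List[Point], labels: List[int]) -> float:
    """Sum of squared Euclidean distances from each point to the
    centroid of its assigned cluster (lower is better)."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        c = centroid(members)
        for p in members:
            total += sum((x - cx) ** 2 for x, cx in zip(p, c))
    return total


def rank_by_sse(
    points: List[Point], labelings: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Order candidate labelings from lowest to highest SSE."""
    scores = {name: sse(points, labels) for name, labels in labelings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Hypothetical toy data: two tight pairs of points far apart on the x-axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labelings = {
    "algo_a": [0, 0, 1, 1],  # groups each nearby pair together
    "algo_b": [0, 1, 0, 1],  # groups points across the gap
}
ranking = rank_by_sse(points, labelings)
print(ranking)
```

SSE only gives a fair comparison between labelings that use the same number of clusters, since it tends to decrease as clusters are added; this is one reason an evaluation would normally pair it with separate cluster quality measures.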

Similar resources

An Optimization K-Modes Clustering Algorithm with Elephant Herding Optimization Algorithm for Crime Clustering

The detection and prevention of crime, in the past few decades, required several years of research and analysis. However, today, thanks to smart systems based on data mining techniques, it is possible to detect and prevent crime in considerably less time. Classification and clustering-based smart techniques can classify and cluster the crime-related samples. The most important factor in the c...

Clustering Heterogeneous Semi-structured Social Science Datasets

Social scientists have begun to collect large datasets that are heterogeneous and semi-structured, but the ability to analyze such data has lagged behind its collection. We design a process to map such datasets to a numerical form, apply singular value decomposition clustering, and explore the impact of individual attributes or fields by overlaying visualizations of the clusters. This provides ...

Parallel Algorithm for Extended Star Clustering

In this paper we present a new parallel clustering algorithm based on the extended star clustering method. This algorithm can be used, for example, to cluster massive data sets of documents on distributed memory multiprocessors. The algorithm exploits the inherent data-parallelism in the extended star clustering algorithm. We implemented our algorithm on a cluster of personal computers connected ...

An Evolutionary Data Clustering Algorithm

Data mining is the process of deriving knowledge from data. Data clustering is a classical activity in data mining. Clustering is the process of grouping objects together in such a way that the objects belonging to the same group are similar and those belonging to different groups are dissimilar. In this paper we propose a method to carry out data clustering using Evolutionary Computation. ...

A Mutually Supervised Ensemble Approach for Clustering Heterogeneous Datasets

We present an algorithm to address the problem of clustering two contextually related heterogeneous datasets that use different feature sets, but consist of non-disjoint sets of objects. The method is based on clustering the datasets individually and then combining the resulting clusters. The algorithm iteratively refines the two sets of clusters using a mutually supervised approach to maximize...

Journal

Journal title: Data in Brief

Year: 2021

ISSN: 2352-3409

DOI: https://doi.org/10.1016/j.dib.2021.107044